xor problem
PReLU: Yet Another Single-Layer Solution to the XOR Problem
Pinto, Rafael C., Tavares, Anderson R.
The XOR problem has traditionally been used to illustrate the limitations of single-layer networks since Minsky and Papert's seminal work [8], which even contributed to the first AI Winter [12]. It has traditionally required at least one hidden layer to solve, making it a litmus test for network complexity. Trivially, any function, no matter how complex, can be learned in a single layer by just using itself as the activation function, and that says nothing about its general applicability and usefulness. Here, however, we reveal this ability in a simple, general and well-established activation function. This study demonstrates how using the Parametric Rectified Linear Unit (PReLU) activation [4] overcomes these limitations, effectively solving the XOR problem without additional layers. This ability has significant implications for neural network design and efficiency, potentially leading to simpler architectures for complex problems. On another front, recent advancements in neuroscience have revealed that individual human neocortical pyramidal neurons can learn to compute the XOR function [3]. This discovery has inspired new artificial neuron models and activation functions that aim to bridge the gap between biological and artificial neurons [9]. Albeit not producing the same activation curves as the ones found in biological neurons, the PReLU activation matches their representational power, at least regarding the XOR function.
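As a minimal sketch of the claim above, the snippet below evaluates one PReLU neuron on the four XOR inputs. The weights, bias, and slope are hand-picked for illustration (an assumption, not values reported by the authors), and the construction assumes the learnable slope `a` is allowed to be negative: with `a = -1` the activation reduces to the absolute value, so the neuron computes |x1 - x2|.

```python
import numpy as np

def prelu(z, a):
    """PReLU: identity for z >= 0, slope `a` for z < 0."""
    return np.where(z >= 0, z, a * z)

# Hand-picked (not learned) parameters, chosen only for illustration.
w = np.array([1.0, -1.0])  # pre-activation is x1 - x2
b = 0.0
a = -1.0                   # slope -1 turns PReLU into |z|

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_xor = np.array([0.0, 1.0, 1.0, 0.0])

outputs = prelu(X @ w + b, a)
print(outputs)  # [0. 1. 1. 0.]
```

A plain ReLU (`a = 0`) with the same weights would output 0 for the input (0, 1), which is why the sign of the slope matters in this sketch.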
488 Solutions to the XOR Problem
A globally convergent homotopy method is defined that is capable of sequentially producing large numbers of stationary points of the multi-layer perceptron mean-squared error surface. Using this algorithm, large subsets of the stationary points of two test problems are found. It is shown empirically that the MLP neural network appears to have an extreme ratio of saddle points compared to local minima, and that even small neural network problems have extremely large numbers of solutions.
Loss Surface Modality of Feed-Forward Neural Network Architectures
Bosman, Anna Sergeevna, Engelbrecht, Andries, Helbig, Mardé
It has been argued in the past that high-dimensional neural networks do not exhibit local minima capable of trapping an optimisation algorithm. However, the relationship between loss surface modality and the neural architecture parameters, such as the number of hidden neurons per layer and the number of hidden layers, remains poorly understood. This study employs fitness landscape analysis to study the modality of neural network loss surfaces under various feed-forward architecture settings. An increase in the problem dimensionality is shown to yield a more searchable and more exploitable loss surface. An increase in the hidden layer width is shown to effectively reduce the number of local minima, and simplify the shape of the global attractor. An increase in the architecture depth is shown to sharpen the global attractor, thus making it more exploitable.
Passive nonlinear dendritic interactions as a general computational resource in functional spiking neural networks
Stöckel, Andreas, Eliasmith, Chris
Nonlinear interactions in the dendritic tree play a key role in neural computation. Nevertheless, modeling frameworks aimed at the construction of large-scale, functional spiking neural networks tend to assume linear, current-based superposition of post-synaptic currents. We extend the theory underlying the Neural Engineering Framework to systematically exploit nonlinear interactions between the local membrane potential and conductance-based synaptic channels as a computational resource. In particular, we demonstrate that even a single passive distal dendritic compartment with AMPA and GABA-A synapses connected to a leaky integrate-and-fire neuron supports the computation of a wide variety of multivariate, bandlimited functions, including the Euclidean norm, controlled shunting, and non-negative multiplication. Our results demonstrate that, for certain operations, the accuracy of dendritic computation is on a par with or even surpasses the accuracy of an additional layer of neurons in the network. These findings allow modelers to construct large-scale models of neurobiological systems that more closely approximate the network topologies and computational resources available in biology. Our results may inform neuromorphic hardware design and could lead to better utilization of resources on existing neuromorphic hardware platforms.
Logistic Regression from scratch (and how to make it nonlinear)
Logistic Regression is a staple of the data science workflow. Below, I show how to implement Logistic Regression with Stochastic Gradient Descent (SGD) in a few dozen lines of Python code, using NumPy. Then I will show how to build a nonlinear decision boundary with Logistic Regression by using feature crosses. Here is the repo with the full code shown below. In many applications, Logistic Regression has been replaced by more advanced techniques such as ensemble tree-based methods (like gradient boosting) or deep neural networks. However, it is still commonly used due to its simplicity and interpretability.
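The feature-cross idea can be sketched on XOR itself. This is not the post's code: the function names `add_cross` and `fit_logreg` are hypothetical, and for determinism the sketch uses full-batch gradient descent rather than the SGD the post describes. Appending the product x1 * x2 as a third feature makes the four XOR points linearly separable, so ordinary logistic regression can fit them.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def add_cross(X):
    """Append the feature cross x1 * x2 as a third column."""
    return np.column_stack([X, X[:, 0] * X[:, 1]])

def fit_logreg(F, y, lr=1.0, steps=5000):
    """Full-batch gradient descent on the mean log loss (SGD swapped out)."""
    w = np.zeros(F.shape[1])
    b = 0.0
    for _ in range(steps):
        err = sigmoid(F @ w + b) - y      # gradient of log loss w.r.t. logits
        w -= lr * (F.T @ err) / len(y)
        b -= lr * err.mean()
    return w, b

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 1.0, 1.0, 0.0])        # XOR labels

F = add_cross(X)                          # columns: x1, x2, x1*x2
w, b = fit_logreg(F, y)
pred = (sigmoid(F @ w + b) >= 0.5).astype(int)
print(pred)
```

The decision boundary is still linear in the expanded feature space; it only looks nonlinear when projected back onto x1 and x2.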
Artificial Neural Networks - Part 2: MLP Implementation for XOr
As promised in part one, this second part details a Java implementation of a multilayer perceptron (MLP) for the XOr problem. Actually, as you will see, the core classes are designed to implement any MLP with a single hidden layer. First, a quick overview of how MLP networks can be used to make predictions for the XOr problem will help. For a more detailed explanation, please review part one of this post. The image at the top of this article depicts the architecture for a multilayer perceptron network designed specifically to solve the XOr problem.
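To make the single-hidden-layer idea concrete, here is a minimal forward-pass sketch in Python (the article's own implementation is in Java and is not reproduced here). The weights are hand-set assumptions rather than trained values: one hidden unit acts as OR, the other as AND, and the output fires when OR is on but AND is off.

```python
import numpy as np

def step(z):
    """Threshold activation: 1 where z > 0, else 0."""
    return (z > 0).astype(float)

# Hand-set weights (illustrative, not learned).
# Each column of W_hidden feeds one hidden unit.
W_hidden = np.array([[1.0, 1.0],
                     [1.0, 1.0]])
b_hidden = np.array([-0.5, -1.5])  # unit 0 behaves as OR, unit 1 as AND
w_out = np.array([1.0, -1.0])      # output = OR and not AND
b_out = -0.5

def mlp_xor(X):
    hidden = step(X @ W_hidden + b_hidden)
    return step(hidden @ w_out + b_out)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
print(mlp_xor(X))  # [0. 1. 1. 0.]
```

A trained MLP would typically use a differentiable activation such as the sigmoid so that backpropagation can find comparable weights on its own.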
Robustness of classification ability of spiking neural networks
Yang, Jie, Zhang, Pingping, Liu, Yan
It is well known that the robustness of artificial neural networks (ANNs) is important for their wide range of applications. In this paper, we focus on the robustness of the classification ability of a spiking neural network which receives perturbed inputs. In principle, the perturbation is allowed to take arbitrary forms. However, Gaussian perturbation and other regular ones have rarely been investigated. For classification problems, the closer to the desired point, the more perturbed points there are in the input space. In addition, the perturbation may be periodic. Based on these facts, we only consider sinusoidal and Gaussian perturbations in this paper. With the SpikeProp algorithm, we perform extensive experiments on the classical XOR problem and three other benchmark datasets. The numerical results show that there is no significant reduction in the classification ability of the network if the input signals are subject to sinusoidal and Gaussian perturbations.
Learning Neural Networks Using Java Libraries - DZone AI
As developers, we are used to thinking in terms of commands or functions. A program is composed of tasks, and each task is defined using some programming constructs. Neural networks differ from this programming approach in the sense that they add the notion of automatic task improvement, or the capability to learn and improve similarly to the way the brain does.
Deep Learning: Recurrent Neural Networks in Python
Like the course I just released on Hidden Markov Models, Recurrent Neural Networks are all about learning sequences - but whereas Markov Models are limited by the Markov assumption, Recurrent Neural Networks are not - and as a result, they are more expressive, and more powerful than anything we've seen on tasks that we haven't made progress on in decades. So what's going to be in this course and how will it build on the previous neural network courses and Hidden Markov Models? In the first section of the course we are going to add the concept of time to our neural networks. I'll introduce you to the Simple Recurrent Unit, also known as the Elman unit. We are going to revisit the XOR problem, but we're going to extend it so that it becomes the parity problem - you'll see that regular feedforward neural networks will have trouble solving this problem but recurrent networks will work because the key is to treat the input as a sequence.
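The point about treating the input as a sequence can be illustrated with a hand-written recurrence. This is only a sketch under stated assumptions: `parity_sequential` is a hypothetical helper, not a trained Elman unit and not code from the course. It carries a single bit of state and folds in one input bit per time step, so the same tiny update handles inputs of any length.

```python
def parity_sequential(bits):
    """Return 1 if the number of ones in `bits` is odd, else 0."""
    state = 0                      # parity of the bits seen so far
    for bit in bits:
        state = abs(state - bit)   # XOR of the running state and the new bit
    return state

print(parity_sequential([1, 0, 1, 1]))  # 1
print(parity_sequential([1, 1, 0, 0]))  # 0
```

A feedforward network has to see all bits at once and separate exponentially many input patterns, whereas the recurrence only ever solves a two-input XOR at each step.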